Incorporating automatic speech recognition methods into the transcription of police-suspect interviews: factors affecting automatic performance
Abstract
Introduction: In England and Wales, transcripts of police-suspect interviews are often admitted as evidence in courts of law. Orthographic transcription is a time-consuming process and is usually carried out by untrained transcribers, resulting in records that contain summaries of large sections of the interview and paraphrased speech. The omission or inaccurate representation of important speech content could have serious consequences in a court of law. It is therefore clear that investigation into better solutions for police-interview transcription is required. This paper explores the possibility of incorporating automatic speech recognition (ASR) methods into the transcription process, with the goal of producing verbatim transcripts without sacrificing police time and money. We consider the potential viability of automatic transcripts as a “first” draft that would be manually corrected by police transcribers. The study additionally investigates the effects of audio quality, regional accent, and the ASR system used, as well as the types and magnitude of errors produced and their implications in the context of police-suspect interview transcripts.

Methods: Speech data was extracted from two forensically-relevant corpora, with speakers of two accents of British English: Standard Southern British English and West Yorkshire English (a non-standard regional variety). Both a high quality and a degraded version of each file were transcribed using three commercially available ASR systems: Amazon, Google, and Rev.

Results: System performance varied depending on the ASR system and the audio quality. While regional accent was not found to significantly predict word error rate, the distribution of errors varied substantially across the accents, with more potentially damaging errors produced for speakers of West Yorkshire English.

Discussion: The low word error rates and easily identifiable errors produced by Amazon suggest that the incorporation of ASR into the transcription of police-suspect interviews could be viable, though more work is required to investigate the effects of other contextual factors, such as multiple speakers and different types of background noise.
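The abstract reports results in terms of word error rate (WER) without defining it. As a rough illustration only, the sketch below uses the common definition: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the ASR output, divided by the number of reference words. The function name, the lower-casing and whitespace tokenisation, and the example sentences are assumptions made here for illustration; the paper's own scoring and text normalisation may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; not the scoring code used in the study.
    """
    # Assumed normalisation: lower-case and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# Invented example: a single dropped word reverses the meaning,
# yet the WER stays low (one deletion over six reference words).
print(word_error_rate("he said he was not there", "he said he was there"))
```

The invented example hints at why the abstract looks at the types of errors as well as the overall rate: a transcript can score a low WER while still containing an error that matters a great deal in an evidential context.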
Similar resources
Incorporating Contextual Phonetics into Automatic Speech Recognition
This work outlines the problems encountered in modeling pronunciation for automatic speech recognition (ASR) of spontaneous (American) English speech. We detail some of the phonetic phenomena within the Switchboard corpus that make the recognition of this speaking style difficult. Phonetic transcribers found that feature spreading and cue trading made identification of phonetic segmental bounda...
Automatic speech recognition performance on a voicemail transcription task
In this paper, we report on the performance of automatic speech recognition (ASR) systems on voicemail transcription. Voicemail is spontaneous telephone speech recorded over a variety of channels; consequently, it is representative of many challenging problems in speech recognition. In the course of working on this task, several algorithms were developed that focus on different components of an...
Predicting Automatic Speech Recognition Performance
In spoken dialogue systems, it is important for a system to know how likely a speech recognition hypothesis is to be correct, so it can reprompt for fresh input, or, in cases where many errors have occurred, change its interaction strategy or switch the caller to a human attendant. We have discovered prosodic features which more accurately predict when a recognition hypothesis contains a word e...
Efficient Methods for Automatic Speech Recognition
This thesis presents work in the area of automatic speech recognition (ASR). The thesis focuses on methods for increasing the efficiency of speech recognition systems and on techniques for efficient representation of different types of knowledge in the decoding process. In this work, several decoding algorithms and recognition systems have been developed, aimed at various recognition tasks. The...
Incorporating named entity recognition into the speech transcription process
Named Entity Recognition (NER) from speech usually involves two sequential steps: transcribing the speech using Automatic Speech Recognition (ASR) and annotating the outputs of the ASR process using NER techniques. Recognizing named entities in automatic transcripts is difficult due to the presence of transcription errors and the absence of some important NER clues, such as capitalization and p...
Journal
Journal title: Frontiers in Communication
Year: 2023
ISSN: 2297-900X
DOI: https://doi.org/10.3389/fcomm.2023.1165233